| Date | Time | Location | Topic |
|---|---|---|---|
| Wednesday, 19-02-2025 | 13.30-16.30 | Van Steenis room E0.02A | Welcome; Intro to R/RStudio; Project organisation |
| Friday, 21-02-2025 | 13.30-16.30 | Van Steenis room E0.02A | Cleaning data; Intro to Statistics |
| Monday, 24-02-2025 | 13.30-16.30 | Van Steenis room E0.02B | Visualising data |
| Friday, 28-02-2025 | 13.30-16.30 | Van Steenis room A2.02A (Corrie Bakelzaal) | Visualising data; Transforming data |
| Monday, 03-03-2025 | 13.30-16.30 | Van Steenis room E0.02B | Transforming data; Modelling data |
| Wednesday, 05-03-2025 | 13.00-17.00 | Van Steenis room E0.02B | Communicating data |
Pronounced /’Arrrgh/
Because it’s the best!
End of presentation.
R is a free and open source software environment for statistical computing and graphics
More than 20,000 packages are available on CRAN
The R community is pretty cool
Seems to be the most popular
RStudio is an integrated development environment (IDE) specifically for R
It provides a bunch of extra features to make using R a delight!
The tidyverse is a collection of R packages sharing the same data science philosophy
It provides a nice workflow for cleaning, visualising, and transforming data
Aspects of ‘base R’ will also be covered
It is not enough to cover all important topics.
It is enough to teach you how to find answers and implement them yourself.
Sheep astragalus morphology from the Iron Age Eastern Mediterranean.
nmar79. (2023). nmar79/Med_Sheep_Astragals: v0.1 (v0.1). Zenodo. https://doi.org/10.5281/zenodo.10276147
Burial data from northeastern Taiwan ranging from the Iron Age through the European colonization period.
Li-Ying Wang & Ben Marwick. (2021). Compendium of R code and data for “A Bayesian networks approach to infer social changes from burials in northeastern Taiwan during the European colonization period”. Accessed 23 Aug 2021. Online at https://osf.io/xga6n/
Assignment 1 consists of finding, importing, cleaning, and exploring/analysing a dataset.
Find a dataset that you want to work with, then use a script to download and clean the data.
If you can’t find a dataset, you can use the full, unmodified version of the workshop data here: https://osf.io/zem9p
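A download-and-clean script could look roughly like the minimal sketch below. It uses base R only; tidyverse functions would work equally well. The helper names `download_if_missing()` and `clean_names()`, the path `data/raw.csv`, the placeholder URL, and the toy data frame are all hypothetical and are not taken from the workshop material.

```r
# Hypothetical helper: download a file only if it is not already on disk,
# so that re-running the script does not fetch the data again.
download_if_missing <- function(url, dest) {
  if (!file.exists(dest)) {
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    download.file(url, destfile = dest, mode = "wb")
  }
  invisible(dest)
}

# Hypothetical helper: lower-case the column names and replace runs of
# non-alphanumeric characters with a single underscore.
clean_names <- function(df) {
  new_names <- tolower(names(df))
  new_names <- gsub("[^a-z0-9]+", "_", new_names)
  new_names <- gsub("^_|_$", "", new_names)
  names(df) <- new_names
  df
}

# Example call (placeholder URL, left commented out so the sketch runs offline):
# download_if_missing("https://example.org/data.csv", "data/raw.csv")
# raw <- read.csv("data/raw.csv", check.names = FALSE)

# Toy data standing in for a downloaded file (not the workshop data).
raw <- data.frame(
  "Site Name" = c("A", "B", "B", NA),
  "Length (mm)" = c("12.5", "n/a", "n/a", "14.1"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

cleaned <- clean_names(raw)

# Treat the "n/a" placeholder as missing, then convert the column to numeric.
cleaned$length_mm[cleaned$length_mm == "n/a"] <- NA
cleaned$length_mm <- as.numeric(cleaned$length_mm)

# Drop exact duplicate rows and rows without a site name.
cleaned <- unique(cleaned)
cleaned <- cleaned[!is.na(cleaned$site_name), ]
```

Keeping the download and the cleaning steps in one script means the raw file never has to be edited by hand, which makes it easier for someone else to re-run the project later.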
If the exploratory data analysis (EDA) module is included in the workshop, then assignment 1 is extended to include EDA. Plot the distribution of at least two types of variables (for example, one numeric and one categorical) that you are interested in exploring further.
Create at least one plot showing the relationship between two variables, and a summary table with the mean and standard deviation of a variable within the groups of another variable.
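The EDA steps above can be sketched as follows. This is only an illustration in base R, using the built-in `iris` data as a stand-in for your own dataset; the choice of variables and the name `summary_table` are assumptions made for the example.

```r
# Built-in iris data, used here as a stand-in for your own dataset.
data(iris)

# Distribution of a numeric variable: histogram.
hist(iris$Sepal.Length,
     main = "Distribution of sepal length",
     xlab = "Sepal length (cm)")

# Distribution of a categorical variable: bar chart of counts.
barplot(table(iris$Species),
        main = "Observations per species",
        ylab = "Count")

# Relationship between two numeric variables: scatter plot.
plot(iris$Petal.Length, iris$Sepal.Length,
     xlab = "Petal length (cm)",
     ylab = "Sepal length (cm)")

# Summary table: mean and standard deviation of sepal length per species.
# tapply() returns one value per factor level, in the order of levels().
summary_table <- data.frame(
  species = levels(iris$Species),
  mean = as.numeric(tapply(iris$Sepal.Length, iris$Species, mean)),
  sd = as.numeric(tapply(iris$Sepal.Length, iris$Species, sd))
)

print(summary_table)
```

The same table could also be built with `dplyr::group_by()` and `dplyr::summarise()`, and the plots with ggplot2, if you prefer the tidyverse workflow.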
If the data modelling module is taught in the workshop, select at least two statistical models to apply to the data.
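As a rough illustration of what applying two models might look like, the sketch below fits a linear regression and a one-way ANOVA to the built-in `iris` data. The dataset, the variables, and the two models are assumptions chosen for the example; which models are appropriate depends on your own data and research question.

```r
# Built-in iris data, used here as a stand-in for your own dataset.
data(iris)

# Model 1: simple linear regression of sepal length on petal length.
fit_lm <- lm(Sepal.Length ~ Petal.Length, data = iris)

# Model 2: one-way ANOVA testing whether mean sepal length differs by species.
fit_aov <- aov(Sepal.Length ~ Species, data = iris)

# Inspect the fitted models.
print(summary(fit_lm))
print(summary(fit_aov))
```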
Use Quarto to communicate the results from the previous sections of assignment 1. This can be in the form of a report, short manuscript, presentation, or whatever format in Quarto you prefer.
You will be paired up with another person in the workshop. Send each other your project and try to run the other’s code.
Make a note of any issues you encounter, and provide feedback in a document.
Incorporate the feedback you receive into your project.